Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use integer time throughout codebase #73

Merged
merged 15 commits into from
Oct 10, 2023
Merged

Use integer time throughout codebase #73

merged 15 commits into from
Oct 10, 2023

Conversation

spenczar
Copy link
Contributor

@spenczar spenczar commented Oct 2, 2023

This is a big PR. I'm relying very heavily on tests, and there are probably spots that I have missed.

I'll try to highlight areas of special concern.

@spenczar spenczar requested a review from moeyensj October 2, 2023 23:10
cartesian.time.to_astropy().tdb.mjd,
cartesian.time.to_numpy(),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've decided to add to_numpy as a convenience method for the conversion to TDD-scaled, MJD-epoched, double-precisioned timestamps. Perhaps the name is opaque?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think to_numpy is a good addition. Though I agree that the name is a little opaque. Do you think it might make sense to add a scale kwarg that will return the numpy array in the desired scale (to_numpy(scale="tdb"))? The default can be set to its current scale?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option (now having read more of the code) might be to enable symmetry between .from_mjd by adding a to_mjd(scale="tdb"). The current mjd() could suffice as well with a scale parameter. The latter could just return a pyarrow array (the additional call to .to_numpy() from there doesn't seem super cumbersome from my perspective).

I suspect a lot of astronomers will expect an interface similar to astropy. Don't know how much we want to mimic it but that's a thought. This doesn't have to be addressed in this PR obviously. Here are some options:

import numpy as np
from adam_core.time import Timestamp

mjd_tdb = np.arange(59000, 60000)

times = Timestamp.from_mjd(mjd_tdb, scale="tdb") # this feels very nice and not much different than astropy
mjd_utc = times.utc().mjd().to_numpy() 
mjd_utc = times.mjd(scale="utc").to_numpy() 
mjd_utc = times.to_numpy(scale="utc")
mjd_utc = times.to_mjd(scale="utc") # the symmetry between .from_mjd() and .to_mjd() feels quite intuitive

@@ -153,6 +153,25 @@ def is_all_nan(self) -> bool:
"""
return np.all(np.isnan(self.to_matrix()))

@classmethod
def nulls(cls, length: int) -> "CoordinateCovariances":
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is helpful when creating a "null" covariances which can be written to and read from Parquet, because apache/arrow#35692 remains an issue.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wrote something similar and it's been super useful. This is great!

Comment on lines 166 to 168
time=Timestamp.from_astropy(
Time(t1_, scale="tdb", format="mjd"),
),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a lot of Timestamp.from_astropy(astropy.Time(...)) calls littered throughout the codebase. Those should be shrinkable down to just Timestamp(...) in some way - possibly a new direct constructor. For example here it would be `Timestamp.from_mjd(t1_, scale="tdb").

I could do those now, if that seems useful.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are feeling inspired, then yes absolutely! All the .from_astropy(Time()) should go since it's pretty silly. I like .from_mjd(scale=...), it's explicit and very convenient.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, definitely.

Comment on lines +49 to +51
@pytest.mark.parametrize("code", ["X05", "I41", "W84"])
@pytest.mark.parametrize("origin", ["sun", "ssb"])
def test_get_observer_state(code, origin):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file looks like it got a lot of changes, but really, this is just parametrization to reduce the repetition. It doesn't do anything novel.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's much better this way.

@@ -0,0 +1,48 @@
import numpy as np
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This file's contents come from the README, more or less.

"jd1": orbits.coordinates.time.jd1,
"jd2": orbits.coordinates.time.jd2,
"day": orbits.coordinates.time.days,
"millis": orbits.coordinates.time.millis(),
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a consequential choice - what's the linking precision for these variants? I figured milliseconds is good enough, but maybe micros? Maybe seconds? Maybe nanos? You tell me.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we might want to define the link precision as a kwarg in this function? When variants are created they will inherit the time of their parent in which case nanos (or whatever the smallest is) would be appropriate. Once we do something to those variants, say we propagate them with PYOORB to some time. There is no guarantee that PYOORB doesn't return a slightly different time for each variant (especially if they might be propagated in different processes where the stepsize might change for that particular chunk of orbits). In this case, I don't know really what to expect.

Maybe let's set it to nanos and be aggressive? If we bump into linkages that have 0 members in one of the tables then we revisit the constraints?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the plus side, this is the beauty of this Timestamp class. We can define those link precisions and that is super nice. I think its definitely a point worth highlighting when we demo the code to others.

Copy link
Member

@moeyensj moeyensj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Early review comments and please push back where necessary.

One thing to check so we avoid continually reverting/redoing changes to the test data is if your astropy cache is affecting the test data diffs.

The only other "major" thing is the interface of the Timestamp class. This will likely replace our use of astropy time so it would be good I think to come up with an API we are happy with. I think astronomers will expect to see something astropy-esque. Having that level of convenience and ease of use would be nice. We may want to consult @mjuric, or better maybe we get you to do a quick show and tell in the DiRAC Solar System meeting to get opinions there.

Lastly, I think we should go "all-in" and change the interfaces of our primary functions (the ones in test_imports.py that are not jax-jitted like propagator.propagate_orbits) to accept Timestamp objects and not astropy Time objects. This could be done in a separate PR if needed but it would inform how much we like the interface to the Timestamp class.

cartesian.time.to_astropy().tdb.mjd,
cartesian.time.to_numpy(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think to_numpy is a good addition. Though I agree that the name is a little opaque. Do you think it might make sense to add a scale kwarg that will return the numpy array in the desired scale (to_numpy(scale="tdb"))? The default can be set to its current scale?

@@ -153,6 +153,25 @@ def is_all_nan(self) -> bool:
"""
return np.all(np.isnan(self.to_matrix()))

@classmethod
def nulls(cls, length: int) -> "CoordinateCovariances":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wrote something similar and it's been super useful. This is great!

Comment on lines 166 to 168
time=Timestamp.from_astropy(
Time(t1_, scale="tdb", format="mjd"),
),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you are feeling inspired, then yes absolutely! All the .from_astropy(Time()) should go since it's pretty silly. I like .from_mjd(scale=...), it's explicit and very convenient.

Comment on lines +49 to +51
@pytest.mark.parametrize("code", ["X05", "I41", "W84"])
@pytest.mark.parametrize("origin", ["sun", "ssb"])
def test_get_observer_state(code, origin):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's much better this way.

adam_core/utils/helpers/data/elements_ssb_ec.csv Outdated Show resolved Hide resolved
"jd1": orbits.coordinates.time.jd1,
"jd2": orbits.coordinates.time.jd2,
"day": orbits.coordinates.time.days,
"millis": orbits.coordinates.time.millis(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we might want to define the link precision as a kwarg in this function? When variants are created they will inherit the time of their parent in which case nanos (or whatever the smallest is) would be appropriate. Once we do something to those variants, say we propagate them with PYOORB to some time. There is no guarantee that PYOORB doesn't return a slightly different time for each variant (especially if they might be propagated in different processes where the stepsize might change for that particular chunk of orbits). In this case, I don't know really what to expect.

Maybe let's set it to nanos and be aggressive? If we bump into linkages that have 0 members in one of the tables then we revisit the constraints?

"jd1": orbits.coordinates.time.jd1,
"jd2": orbits.coordinates.time.jd2,
"day": orbits.coordinates.time.days,
"millis": orbits.coordinates.time.millis(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the plus side, this is the beauty of this Timestamp class. We can define those link precisions and that is super nice. I think its definitely a point worth highlighting when we demo the code to others.

adam_core/time/interval.py Outdated Show resolved Hide resolved
cartesian.time.to_astropy().tdb.mjd,
cartesian.time.to_numpy(),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option (now having read more of the code) might be to enable symmetry between .from_mjd by adding a to_mjd(scale="tdb"). The current mjd() could suffice as well with a scale parameter. The latter could just return a pyarrow array (the additional call to .to_numpy() from there doesn't seem super cumbersome from my perspective).

I suspect a lot of astronomers will expect an interface similar to astropy. Don't know how much we want to mimic it but that's a thought. This doesn't have to be addressed in this PR obviously. Here are some options:

import numpy as np
from adam_core.time import Timestamp

mjd_tdb = np.arange(59000, 60000)

times = Timestamp.from_mjd(mjd_tdb, scale="tdb") # this feels very nice and not much different than astropy
mjd_utc = times.utc().mjd().to_numpy() 
mjd_utc = times.mjd(scale="utc").to_numpy() 
mjd_utc = times.to_numpy(scale="utc")
mjd_utc = times.to_mjd(scale="utc") # the symmetry between .from_mjd() and .to_mjd() feels quite intuitive

@moeyensj
Copy link
Member

moeyensj commented Oct 3, 2023

I forgot to say this in the review: These are excellent changes and I'm really excited to be able to move away from insane floating point time madness.

@@ -19,7 +18,7 @@ class Exposures(qv.Table):
"""

id = qv.StringColumn()
start_time = Times.as_column()
start_time = Timestamp.as_column()
duration = qv.Float64Column(validator=and_(ge(0), le(3600)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if switching this to an integer (nanoseconds?) would be worth the inconvenience?

Convenience method for converting to MJD TDB as an array of numpy
float64s
Use parquet instead of CSV for data files because it can stored
fixed-length lists. Add utilties for creating CoordinateCovariances
which are full of nulls, since that's necessary when reading fixed
length lists with null data out of parquet.

Condense and parametrize the test case for observer state computation.

Obey lint
@spenczar spenczar merged commit 461087a into main Oct 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants